Compressed Multirow Storage Format for Sparse Matrices on Graphics Processing Units
Similar resources
Compressed Multirow Storage Format for Sparse Matrices on Graphics Processing Units
A new format for storing sparse matrices is proposed for efficient sparse matrix-vector (SpMV) product calculation on modern graphics processing units (GPUs). This format extends the standard compressed row storage (CRS) format and can be quickly converted to and from it. Computational performance of two SpMV kernels for the new format is determined for over 130 sparse matrices on Fermi-class a...
An Alternative Compressed Storage Format for Sparse Matrices
The sparse matrix-vector product (SMVP) is a common kernel in many scientific applications. This kernel is an irregular problem, which has led to the development of several compressed storage formats such as CRS, CCS, and JDS, among others. We propose an alternative storage format, the Transpose Jagged Diagonal Storage (TJDS), which is inspired by the Jagged Diagonal Storage form...
Vectorized Sparse Matrix Multiply for Compressed Row Storage Format
The innovation of this work is a simple vectorizable algorithm for performing sparse matrix vector multiply in compressed sparse row (CSR) storage format. Unlike the vectorizable jagged diagonal format (JAD), this algorithm requires no data rearrangement and can be easily adapted to a sophisticated library framework such as PETSc. Numerical experiments on the Cray X1 show an order of magnitude ...
Multifrontal Sparse Matrix Factorization on Graphics Processing Units
For many finite element problems, when represented as sparse matrices, iterative solvers are found to be unreliable because they can impose computational bottlenecks. Early pioneering work by Duff et al. explored an alternative strategy called multifrontal sparse matrix factorization. This approach, by representing the sparse problem as a tree of dense systems, maps well to modern memory hierar...
Cofactorization on Graphics Processing Units
We show how the cofactorization step, a compute-intensive part of the relation collection phase of the number field sieve (NFS), can be farmed out to a graphics processing unit. Our implementation on a GTX 580 GPU, which is integrated with a state-of-the-art NFS implementation, can serve as a cryptanalytic co-processor for several Intel i7-3770K quad-core CPUs simultaneously. This allows those ...
Journal
Journal title: SIAM Journal on Scientific Computing
Year: 2014
ISSN: 1064-8275,1095-7197
DOI: 10.1137/120900216